57 research outputs found

An application of distributional semantics for the analysis of the Holy Quran

Author: Benotto Giulia
Giovannetti Emiliano
Nahli Ouafae
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

In this contribution we illustrate the methodology and results of an experiment in which we applied Distributional Semantics Models to the analysis of the Holy Quran. Our aim was to gather information on how the meanings of words in the Quran may differ from the meanings the same words take on in Modern Standard Arabic. To do so we used the Penn Arabic Treebank as a contrastive corpus...
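The contrastive idea in this abstract can be illustrated with a minimal sketch. The abstract does not say which distributional model the authors used, so the window-based co-occurrence counts, the cosine measure, the function names, and the English toy sentences below are all assumptions made for illustration only.

```python
import math
from collections import Counter


def context_vector(tokens, target, window=2):
    """Count the words that occur within `window` positions of `target`."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vec[tokens[j]] += 1
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy corpora standing in for the two text collections.
corpus_a = "the river bank was muddy".split()
corpus_b = "the central bank raised rates".split()

similarity = cosine(
    context_vector(corpus_a, "bank", window=1),
    context_vector(corpus_b, "bank", window=1),
)
```

With these toy sentences the two context vectors share no words, so the similarity is 0.0; a low score of this kind is the sort of signal that would flag a word as a candidate for differing usage between two corpora.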

Archivio della ricerca- Università di Roma La Sapienza

Towards a flexible open-source software library for multi-layered scholarly textual studies: An Arabic case study dealing with semi-automatic language processing

Author: Del Grosso Angelo Mario
Nahli Ouafae
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

This paper presents both the general model and a case study of the Computational and Collaborative Philology Library (CoPhiLib), an ongoing initiative at the Institute for Computational Linguistics (ILC) of the National Research Council (CNR), Pisa, Italy. The library, designed and organized as a reusable, abstract and open-source software component, aims to meet the needs of multi-lingual and cross-lingual analysis by exposing common Application Programming Interfaces (APIs). The core modules, written in Java, form the groundwork of a Web platform designed to address the needs of textual scholarship. The Web application, implemented according to the Java Enterprise specifications, focuses on multi-layered analysis for the study of literary documents and related multimedia sources. The ambitious goal is to manage textual resources while abstracting from the specific language on the one hand and decoupling from the requirements of individual projects on the other. This goal is pursued through the methodologies of the 'agile process' and through suitable use case modeling, design patterns, and component-based architectures. The reusability and flexibility of the system have been tested on an Arabic case study: the system allows users to choose the morphological engine (such as AraMorph or Al-Khalil) as well as the linguistic granularity (i.e. with or without declension). Finally, the application enables the construction of annotated resources that can serve as training sets for statistical engines. © 2014 IEEE

Archivio della ricerca- Università di Roma La Sapienza

Motore morfologico della lingua araba

Author: Nahli Ouafae
Sassolini Eva

The morphological engine has been designed to perform a double function: to generate automatically, from one Arabic entry, all its forms (including their morpho-syntactic classification); and to allow morphological analysis, that is, to go back from one form to the dictionary entry (or entries

PUblication MAnagement

Commerce Numérique: traffic signals for crossroads between cultures.

Author: Antonietta Sanna
Federico Boschetti
Michela Bandini
Ouafae Nahli
Publication date: 01/01/2021

Commerce is a French literary journal, founded by Princess Margherita Caetani, which relied on the collaboration of three prestigious writers: Paul Valéry, Léon-Paul Fargue, and Valéry Larbaud. The journal comprises twenty-nine volumes published between 1924 and 1932. Each volume includes a variety of literary material, such as poems and novels, written by both well-known and unknown writers, who also translated important authors like Joyce, T.S. Eliot, Pirandello, Ungaretti, Saint-John Perse, Rilke, and Hofmannsthal. Considering the historical, literary, and cultural importance of the journal Commerce, our project “Commerce numérique” aims to digitize the journal and make its contents freely available online to both the general public and the research community. This article describes how the journal was encoded, with particular attention to the encoding of the poems in Commerce. Some poems appear in the original language accompanied by their French translation; others appear only in French translation, without the original text. To express these phenomena and their structures fully and accurately, we adopted some aspects of the TEI framework, which are explained in detail. The French translation of a Moroccan Arabic poem from the 13th century is also considered. The original Arabic poem is interesting because it presents aspects of both the Moroccan dialect and the oral text. Studying and encoding the Arabic poem in parallel with its translation highlights some important structural differences between Arabic poetry and Western poetry

Archivio della Ricerca - Università di Pisa

Enseigner l’« histoire des religions » Que faire de l’Antiquité ?

Author: Khalfi Mustapha
Nahli Ouafae
Zarghili Arsalane
Publication venue: Anabases
Publication date: 24/05/2012

The aim is to teach not religions (still less to "transmit religious beliefs") but their history, in relation to contexts, milieus, major events and figures, and cultures: in other words, a history that spans distant spaces and differentiated temporalities. On this point there is now more or less general agreement, but this consensus leaves many problems unresolved, as shown by a twofold debate among specialists (and "interested parties") on the defini..

ILC4CLARIN: Linguistic Data and NLP Tool

OpenEdition

AraMorph Data Plus

Author: Nahli Ouafae
Publication venue: Istituto di Linguistica Computazionale “A. Zampolli” - Consiglio Nazionale delle Ricerche (ILC-CNR)
Publication date: 28/09/2018

The original AraMorph engine (https://sourceforge.net/projects/aramorph/files/aramorph/1.2.1) uses six linguistic files. Three are Arabic-English lexicon files: prefixes (299 entries), suffixes (618 entries), and stems (82158 entries representing 38600 lemmas). The other three are morphological compatibility tables that control prefix-stem combinations (1648 entries), stem-suffix combinations (1285 entries), and prefix-suffix combinations (598 entries). The present data consists of the updated lexical resources used by the AraMorph engine. The updates take advantage of a number of orthographic, morpho-syntactic and semantic constraints that operate at the word level. As a result, the updated Arabic-English lexicon files contain: prefixes (335 entries), suffixes (876 entries), and stems (35475 entries). Note that the number of stems is smaller in Plus than in Original, due to the removal of obsolete entries and of a number of foreign names that are unlikely to be found in Arabic texts. The updated morphological compatibility tables control prefix-stem combinations (2698 entries), stem-suffix combinations (2161 entries), and prefix-suffix combinations (1295 entries).
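A minimal sketch of how three lexicons and three compatibility tables of this kind can work together: a word is split into every possible prefix + stem + suffix, and a split is accepted only if all three parts are in the lexicons and all three category pairs are licensed. The entries, category labels, and function names below are hypothetical toy values, not taken from the AraMorph files, and each form is simplified to a single category.

```python
from typing import Iterator, NamedTuple


class Analysis(NamedTuple):
    prefix: str
    stem: str
    suffix: str


# Toy lexicons (hypothetical): surface form -> morphological category.
PREFIXES = {"": "Pref-0", "wa": "Pref-Wa"}
STEMS = {"kitAb": "N", "katab": "PV"}
SUFFIXES = {"": "Suff-0", "At": "NSuff-At", "at": "PVSuff-at"}

# Toy compatibility tables (hypothetical): sets of licensed category pairs.
PREFIX_STEM = {("Pref-0", "N"), ("Pref-Wa", "N"),
               ("Pref-0", "PV"), ("Pref-Wa", "PV")}
STEM_SUFFIX = {("N", "Suff-0"), ("N", "NSuff-At"),
               ("PV", "Suff-0"), ("PV", "PVSuff-at")}
PREFIX_SUFFIX = {("Pref-0", "Suff-0"), ("Pref-0", "NSuff-At"),
                 ("Pref-0", "PVSuff-at"), ("Pref-Wa", "Suff-0"),
                 ("Pref-Wa", "NSuff-At"), ("Pref-Wa", "PVSuff-at")}


def analyze(word: str) -> Iterator[Analysis]:
    """Yield every segmentation of `word` licensed by the toy tables."""
    n = len(word)
    for i in range(n + 1):
        for j in range(i, n + 1):
            prefix, stem, suffix = word[:i], word[i:j], word[j:]
            if (prefix not in PREFIXES or stem not in STEMS
                    or suffix not in SUFFIXES):
                continue
            p_cat = PREFIXES[prefix]
            s_cat = STEMS[stem]
            x_cat = SUFFIXES[suffix]
            # All three pairwise checks must succeed.
            if ((p_cat, s_cat) in PREFIX_STEM
                    and (s_cat, x_cat) in STEM_SUFFIX
                    and (p_cat, x_cat) in PREFIX_SUFFIX):
                yield Analysis(prefix, stem, suffix)
```

In this toy setup, `list(analyze("wakitAb"))` yields a single segmentation (prefix `wa`, stem `kitAb`, empty suffix), while `list(analyze("katabAt"))` is empty because the toy stem-suffix table does not license a verbal stem followed by a nominal suffix.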

ILC4CLARIN: Linguistic Data and NLP Tool

Vers une ontologie de la culture arabo-musulmane

Author: Nahli Ouafae
Publication date: 13/02/2018

The project "Vers une ontologie de la culture arabo-musulmane" (Towards an ontology of Arab-Muslim culture) aims to create a digital lexical-semantic resource through the automatic extraction of data from the Arabic lexicon al-qāmūs al-muḥīṭ (qāmūs), compiled by ’al-fīrūz’ābādī (1329-1414). By means of MonolingualExternalRef attributes, each lemma of the digitized qāmūs resource will be linked to the corresponding synset of the English-language WordNet (PWN) and to the corresponding concept of the SUMO ontology (whenever possible)

Archivio della ricerca- Università di Roma La Sapienza

al-qāmūs l-muḥīṭ: a digital Arabic dictionary: letter jīm

Author: Khalfi Mustapha
Nahli Ouafae
Zarghili Arsalane
Publication venue: Laboratory of Intelligent Systems and Applications, Faculty of Sciences and Technology, B.P. 2202, Imouzzer road Fez, Morocco
Publication date: 20/12/2019

The dossier for letter jīm contains the following. TXT file: the portion of plain text corresponding to the section of the letter jīm. XML files without translation: the conversion of the text into XML, resulting from information extraction and tagging of lemma, part of speech, lexical information, derivational information, and meanings. XML files with translation: the same files enriched with translations of lemmas and corresponding senses taken from the bilingual dictionary "An Advanced Learner's Arabic-English Dictionary" by H. Anthony Salmoné, published in 1889. This section contains 28 chapters, 461 roots and 1921 lexical entries.

ILC4CLARIN: Linguistic Data and NLP Tool

al-qāmūs l-muḥīṭ: a digital Arabic dictionary: letter wāw

Author: Khalfi Mustapha
Nahli Ouafae
Zarghili Arsalane
Publication venue: Laboratory of Intelligent Systems and Applications, Faculty of Sciences and Technology, B.P. 2202, Imouzzer road Fez, Morocco
Publication date: 20/12/2019

The dossier for letter wāw contains the following. TXT file: the portion of plain text corresponding to the section of the letter wāw. XML files without translation: the conversion of the text into XML, resulting from information extraction and tagging of lemma, part of speech, lexical information, derivational information, and meanings. XML files with translation: the same files enriched with translations of lemmas and corresponding senses taken from the bilingual dictionary "An Advanced Learner's Arabic-English Dictionary" by H. Anthony Salmoné, published in 1889. This section contains 28 chapters, 374 roots and 2877 lexical entries.

ILC4CLARIN: Linguistic Data and NLP Tool

al-qāmūs l-muḥīṭ: a digital Arabic dictionary: letter zāy

Author: Khalfi Mustapha
Nahli Ouafae
Zarghili Arsalane
Publication venue: Laboratory of Intelligent Systems and Applications, Faculty of Sciences and Technology, B.P. 2202, Imouzzer road Fez, Morocco
Publication date: 20/12/2019

The dossier for letter zāy contains the following. TXT file: the portion of plain text corresponding to the section of the letter zāy. XML files without translation: the conversion of the text into XML, resulting from information extraction and tagging of lemma, part of speech, lexical information, derivational information, and meanings. XML files with translation: the same files enriched with translations of lemmas and corresponding senses taken from the bilingual dictionary "An Advanced Learner's Arabic-English Dictionary" by H. Anthony Salmoné, published in 1889. This section contains 24 chapters, 284 roots and 1450 lexical entries.

ILC4CLARIN: Linguistic Data and NLP Tool